{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "intro_to_neural_nets.ipynb",
      "version": "0.3.2",
      "views": {},
      "default_view": {},
      "provenance": [],
      "collapsed_sections": [
        "O2q5RRCKqYaU",
        "vvT2jDWjrKew",
        "copyright-notice"
      ]
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "metadata": {
        "id": "copyright-notice",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "#### Copyright 2017 Google LLC."
      ]
    },
    {
      "metadata": {
        "cellView": "both",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        },
        "id": "copyright-notice2",
        "colab_type": "code"
      },
      "outputs": [],
      "cell_type": "code",
      "source": [
        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ],
      "execution_count": 0
    },
    {
      "metadata": {
        "id": "eV16J6oUY-HN",
        "colab_type": "text",
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": [
        " # Introduction aux r\u00e9seaux de neurones"
      ]
    },
    {
      "metadata": {
        "id": "_wIcUFLSKNdx",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " **Objectifs d'apprentissage\u00a0:**\n",
        "  * D\u00e9finir un r\u00e9seau de neurones et ses couches cach\u00e9es \u00e0 l'aide de la classe `DNNRegressor` de TensorFlow\n",
        "  * Entra\u00eener un r\u00e9seau de neurones \u00e0 apprendre des non-lin\u00e9arit\u00e9s dans un ensemble de donn\u00e9es et \u00e0 \u00eatre plus performant qu'un mod\u00e8le de r\u00e9gression lin\u00e9aire"
      ]
    },
    {
      "metadata": {
        "id": "_ZZ7f7prKNdy",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Dans les exercices pr\u00e9c\u00e9dents, des caract\u00e9ristiques synth\u00e9tiques ont \u00e9t\u00e9 utilis\u00e9es pour permettre au mod\u00e8le d'incorporer des non-lin\u00e9arit\u00e9s.\n",
        "\n",
        "Un ensemble de non-lin\u00e9arit\u00e9s important concernait la latitude et la longitude, mais il peut y en avoir d'autres.\n",
        "\n",
        "Nous allons revenir, pour l'instant, \u00e0 une t\u00e2che de r\u00e9gression standard plut\u00f4t qu'\u00e0 la t\u00e2che de r\u00e9gression logistique de l'exercice pr\u00e9c\u00e9dent. En d'autres termes, nous allons pr\u00e9dire directement la valeur m\u00e9diane d'un logement (`median_house_value`)."
      ]
    },
    {
      "metadata": {
        "id": "J2kqX6VZTHUy",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## Configuration\n",
        "\n",
        "Commencez par charger et pr\u00e9parer les donn\u00e9es."
      ]
    },
    {
      "metadata": {
        "id": "AGOM1TUiKNdz",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "from __future__ import print_function\n",
        "\n",
        "import math\n",
        "\n",
        "from IPython import display\n",
        "from matplotlib import cm\n",
        "from matplotlib import gridspec\n",
        "from matplotlib import pyplot as plt\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "from sklearn import metrics\n",
        "import tensorflow as tf\n",
        "from tensorflow.python.data import Dataset\n",
        "\n",
        "tf.logging.set_verbosity(tf.logging.ERROR)\n",
        "pd.options.display.max_rows = 10\n",
        "pd.options.display.float_format = '{:.1f}'.format\n",
        "\n",
        "california_housing_dataframe = pd.read_csv(\"https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv\", sep=\",\")\n",
        "\n",
        "california_housing_dataframe = california_housing_dataframe.reindex(\n",
        "    np.random.permutation(california_housing_dataframe.index))"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "2I8E2qhyKNd4",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def preprocess_features(california_housing_dataframe):\n",
        "  \"\"\"Prepares input features from California housing data set.\n",
        "\n",
        "  Args:\n",
        "    california_housing_dataframe: A Pandas DataFrame expected to contain data\n",
        "      from the California housing data set.\n",
        "  Returns:\n",
        "    A DataFrame that contains the features to be used for the model, including\n",
        "    synthetic features.\n",
        "  \"\"\"\n",
        "  selected_features = california_housing_dataframe[\n",
        "    [\"latitude\",\n",
        "     \"longitude\",\n",
        "     \"housing_median_age\",\n",
        "     \"total_rooms\",\n",
        "     \"total_bedrooms\",\n",
        "     \"population\",\n",
        "     \"households\",\n",
        "     \"median_income\"]]\n",
        "  processed_features = selected_features.copy()\n",
        "  # Create a synthetic feature.\n",
        "  processed_features[\"rooms_per_person\"] = (\n",
        "    california_housing_dataframe[\"total_rooms\"] /\n",
        "    california_housing_dataframe[\"population\"])\n",
        "  return processed_features\n",
        "\n",
        "def preprocess_targets(california_housing_dataframe):\n",
        "  \"\"\"Prepares target features (i.e., labels) from California housing data set.\n",
        "\n",
        "  Args:\n",
        "    california_housing_dataframe: A Pandas DataFrame expected to contain data\n",
        "      from the California housing data set.\n",
        "  Returns:\n",
        "    A DataFrame that contains the target feature.\n",
        "  \"\"\"\n",
        "  output_targets = pd.DataFrame()\n",
        "  # Scale the target to be in units of thousands of dollars.\n",
        "  output_targets[\"median_house_value\"] = (\n",
        "    california_housing_dataframe[\"median_house_value\"] / 1000.0)\n",
        "  return output_targets"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "pQzcj2B1T5dA",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "# Choose the first 12000 (out of 17000) examples for training.\n",
        "training_examples = preprocess_features(california_housing_dataframe.head(12000))\n",
        "training_targets = preprocess_targets(california_housing_dataframe.head(12000))\n",
        "\n",
        "# Choose the last 5000 (out of 17000) examples for validation.\n",
        "validation_examples = preprocess_features(california_housing_dataframe.tail(5000))\n",
        "validation_targets = preprocess_targets(california_housing_dataframe.tail(5000))\n",
        "\n",
        "# Double-check that we've done the right thing.\n",
        "print(\"Training examples summary:\")\n",
        "display.display(training_examples.describe())\n",
        "print(\"Validation examples summary:\")\n",
        "display.display(validation_examples.describe())\n",
        "\n",
        "print(\"Training targets summary:\")\n",
        "display.display(training_targets.describe())\n",
        "print(\"Validation targets summary:\")\n",
        "display.display(validation_targets.describe())"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "RWq0xecNKNeG",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## Construction d'un r\u00e9seau de neurones\n",
        "\n",
        "Le r\u00e9seau de neurones est d\u00e9fini par la classe [DNNRegressor](https://www.tensorflow.org/api_docs/python/tf/estimator/DNNRegressor).\n",
        "\n",
        "Utilisez **`hidden_units`** pour d\u00e9finir la structure du r\u00e9seau de neurones. L'argument `hidden_units` fournit une liste d'entiers\u00a0; chaque entier correspond \u00e0 une couche cach\u00e9e et indique le nombre de n\u0153uds qu'il contient. Prenons comme exemple l'affectation suivante\u00a0:\n",
        "\n",
        "`hidden_units=[3,10]`\n",
        "\n",
        "L'affectation ci-dessus sp\u00e9cifie un r\u00e9seau de neurones avec deux couches cach\u00e9es\u00a0:\n",
        "\n",
        "* La premi\u00e8re couche cach\u00e9e contient 3\u00a0n\u0153uds.\n",
        "* La deuxi\u00e8me couche cach\u00e9e contient 10\u00a0n\u0153uds.\n",
        "\n",
        "Pour ajouter des couches, vous devez ajouter des entiers dans la liste. Par exemple, `hidden_units=[10,20,30,40]` cr\u00e9era quatre couches comportant respectivement dix, vingt, trente et quarante unit\u00e9s.\n",
        "\n",
        "Par d\u00e9faut, toutes les couches cach\u00e9es utiliseront l'activation ReLU et seront enti\u00e8rement connect\u00e9es."
      ]
    },
    {
      "metadata": {
        "id": "ni0S6zHcTb04",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def construct_feature_columns(input_features):\n",
        "  \"\"\"Construct the TensorFlow Feature Columns.\n",
        "\n",
        "  Args:\n",
        "    input_features: The names of the numerical input features to use.\n",
        "  Returns:\n",
        "    A set of feature columns\n",
        "  \"\"\" \n",
        "  return set([tf.feature_column.numeric_column(my_feature)\n",
        "              for my_feature in input_features])"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "zvCqgNdzpaFg",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):\n",
        "    \"\"\"Trains a neural net regression model.\n",
        "  \n",
        "    Args:\n",
        "      features: pandas DataFrame of features\n",
        "      targets: pandas DataFrame of targets\n",
        "      batch_size: Size of batches to be passed to the model\n",
        "      shuffle: True or False. Whether to shuffle the data.\n",
        "      num_epochs: Number of epochs for which data should be repeated. None = repeat indefinitely\n",
        "    Returns:\n",
        "      Tuple of (features, labels) for next data batch\n",
        "    \"\"\"\n",
        "    \n",
        "    # Convert pandas data into a dict of np arrays.\n",
        "    features = {key:np.array(value) for key,value in dict(features).items()}                                             \n",
        " \n",
        "    # Construct a dataset, and configure batching/repeating.\n",
        "    ds = Dataset.from_tensor_slices((features,targets)) # warning: 2GB limit\n",
        "    ds = ds.batch(batch_size).repeat(num_epochs)\n",
        "    \n",
        "    # Shuffle the data, if specified.\n",
        "    if shuffle:\n",
        "      ds = ds.shuffle(10000)\n",
        "    \n",
        "    # Return the next batch of data.\n",
        "    features, labels = ds.make_one_shot_iterator().get_next()\n",
        "    return features, labels"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "U52Ychv9KNeH",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def train_nn_regression_model(\n",
        "    learning_rate,\n",
        "    steps,\n",
        "    batch_size,\n",
        "    hidden_units,\n",
        "    training_examples,\n",
        "    training_targets,\n",
        "    validation_examples,\n",
        "    validation_targets):\n",
        "  \"\"\"Trains a neural network regression model.\n",
        "  \n",
        "  In addition to training, this function also prints training progress information,\n",
        "  as well as a plot of the training and validation loss over time.\n",
        "  \n",
        "  Args:\n",
        "    learning_rate: A `float`, the learning rate.\n",
        "    steps: A non-zero `int`, the total number of training steps. A training step\n",
        "      consists of a forward and backward pass using a single batch.\n",
        "    batch_size: A non-zero `int`, the batch size.\n",
        "    hidden_units: A `list` of int values, specifying the number of neurons in each layer.\n",
        "    training_examples: A `DataFrame` containing one or more columns from\n",
        "      `california_housing_dataframe` to use as input features for training.\n",
        "    training_targets: A `DataFrame` containing exactly one column from\n",
        "      `california_housing_dataframe` to use as target for training.\n",
        "    validation_examples: A `DataFrame` containing one or more columns from\n",
        "      `california_housing_dataframe` to use as input features for validation.\n",
        "    validation_targets: A `DataFrame` containing exactly one column from\n",
        "      `california_housing_dataframe` to use as target for validation.\n",
        "      \n",
        "  Returns:\n",
        "    A `DNNRegressor` object trained on the training data.\n",
        "  \"\"\"\n",
        "\n",
        "  periods = 10\n",
        "  steps_per_period = steps / periods\n",
        "  \n",
        "  # Create a DNNRegressor object.\n",
        "  my_optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\n",
        "  my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n",
        "  dnn_regressor = tf.estimator.DNNRegressor(\n",
        "      feature_columns=construct_feature_columns(training_examples),\n",
        "      hidden_units=hidden_units,\n",
	"      optimizer=my_optimizer\n",
        "  )\n",
        "  \n",
        "  # Create input functions.\n",
        "  training_input_fn = lambda: my_input_fn(training_examples, \n",
        "                                          training_targets[\"median_house_value\"], \n",
        "                                          batch_size=batch_size)\n",
        "  predict_training_input_fn = lambda: my_input_fn(training_examples, \n",
        "                                                  training_targets[\"median_house_value\"], \n",
        "                                                  num_epochs=1, \n",
        "                                                  shuffle=False)\n",
        "  predict_validation_input_fn = lambda: my_input_fn(validation_examples, \n",
        "                                                    validation_targets[\"median_house_value\"], \n",
        "                                                    num_epochs=1, \n",
        "                                                    shuffle=False)\n",
        "\n",
        "  # Train the model, but do so inside a loop so that we can periodically assess\n",
        "  # loss metrics.\n",
        "  print(\"Training model...\")\n",
        "  print(\"RMSE (on training data):\")\n",
        "  training_rmse = []\n",
        "  validation_rmse = []\n",
        "  for period in range (0, periods):\n",
        "    # Train the model, starting from the prior state.\n",
        "    dnn_regressor.train(\n",
        "        input_fn=training_input_fn,\n",
        "        steps=steps_per_period\n",
        "    )\n",
        "    # Take a break and compute predictions.\n",
        "    training_predictions = dnn_regressor.predict(input_fn=predict_training_input_fn)\n",
        "    training_predictions = np.array([item['predictions'][0] for item in training_predictions])\n",
        "    \n",
        "    validation_predictions = dnn_regressor.predict(input_fn=predict_validation_input_fn)\n",
        "    validation_predictions = np.array([item['predictions'][0] for item in validation_predictions])\n",
        "    \n",
        "    # Compute training and validation loss.\n",
        "    training_root_mean_squared_error = math.sqrt(\n",
        "        metrics.mean_squared_error(training_predictions, training_targets))\n",
        "    validation_root_mean_squared_error = math.sqrt(\n",
        "        metrics.mean_squared_error(validation_predictions, validation_targets))\n",
        "    # Occasionally print the current loss.\n",
        "    print(\"  period %02d : %0.2f\" % (period, training_root_mean_squared_error))\n",
        "    # Add the loss metrics from this period to our list.\n",
        "    training_rmse.append(training_root_mean_squared_error)\n",
        "    validation_rmse.append(validation_root_mean_squared_error)\n",
        "  print(\"Model training finished.\")\n",
        "\n",
        "  # Output a graph of loss metrics over periods.\n",
        "  plt.ylabel(\"RMSE\")\n",
        "  plt.xlabel(\"Periods\")\n",
        "  plt.title(\"Root Mean Squared Error vs. Periods\")\n",
        "  plt.tight_layout()\n",
        "  plt.plot(training_rmse, label=\"training\")\n",
        "  plt.plot(validation_rmse, label=\"validation\")\n",
        "  plt.legend()\n",
        "\n",
        "  print(\"Final RMSE (on training data):   %0.2f\" % training_root_mean_squared_error)\n",
        "  print(\"Final RMSE (on validation data): %0.2f\" % validation_root_mean_squared_error)\n",
        "\n",
        "  return dnn_regressor"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "2QhdcCy-Y8QR",
        "colab_type": "text",
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": [
        " ## T\u00e2che\u00a01\u00a0: Entra\u00eener un mod\u00e8le de r\u00e9seau de neurones\n",
        "\n",
        "**R\u00e9glez les hyperparam\u00e8tres, dans le but de faire passer la valeur RMSE en dessous de 110.**\n",
        "\n",
        "Ex\u00e9cutez le bloc suivant pour entra\u00eener un mod\u00e8le de r\u00e9seau de neurones.\n",
        "\n",
        "Pour rappel, dans l'exercice de r\u00e9gression lin\u00e9aire avec de nombreuses caract\u00e9ristiques, une valeur RMSE d'environ\u00a0110 \u00e9tait plut\u00f4t satisfaisante. Nous allons essayer de faire mieux.\n",
        "\n",
        "Votre objectif est ici de modifier divers param\u00e8tres d'apprentissage afin d'am\u00e9liorer la justesse sur les donn\u00e9es de validation.\n",
        "\n",
        "Le surapprentissage repr\u00e9sente un risque r\u00e9el pour les r\u00e9seaux de neurones. L'\u00e9cart entre la perte sur les donn\u00e9es d'apprentissage et la perte sur les donn\u00e9es de validation est un bon indicateur pour d\u00e9terminer si votre mod\u00e8le commence \u00e0 surapprendre. En r\u00e8gle g\u00e9n\u00e9rale, une augmentation de cet \u00e9cart signale \u00e0 coup s\u00fbr un surapprentissage.\n",
        "\n",
        "Compte tenu de la pl\u00e9thore de param\u00e8tres diff\u00e9rents, il est vivement conseill\u00e9 de prendre des notes \u00e0 chaque tentative afin d'orienter le processus de d\u00e9veloppement.\n",
        "\n",
        "De m\u00eame, lorsque vous avez trouv\u00e9 un param\u00e8tre satisfaisant, ex\u00e9cutez-le \u00e0 plusieurs reprises afin de tester la r\u00e9p\u00e9tabilit\u00e9 du r\u00e9sultat. En r\u00e8gle g\u00e9n\u00e9rale, les pond\u00e9rations de r\u00e9seau de neurones sont initialis\u00e9es sur de petites valeurs al\u00e9atoires. Vous devriez donc constater des diff\u00e9rences entre deux cycles d'ex\u00e9cution.\n"
      ]
    },
    {
      "metadata": {
        "id": "rXmtSW1yKNeK",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "dnn_regressor = train_nn_regression_model(\n",
        "    learning_rate=0.01,\n",
        "    steps=500,\n",
        "    batch_size=10,\n",
        "    hidden_units=[10, 2],\n",
        "    training_examples=training_examples,\n",
        "    training_targets=training_targets,\n",
        "    validation_examples=validation_examples,\n",
        "    validation_targets=validation_targets)"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "O2q5RRCKqYaU",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ### Solution\n",
        "\n",
        "Cliquez ci-dessous pour afficher une solution."
      ]
    },
    {
      "metadata": {
        "id": "j2Yd5VfrqcC3",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " **REMARQUE\u00a0:** Cette s\u00e9lection de param\u00e8tres est quelque peu arbitraire. Dans le cas pr\u00e9sent, nous avons essay\u00e9 des combinaisons de plus en plus complexes, en allongeant la dur\u00e9e d'apprentissage, jusqu'\u00e0 ce que l'erreur soit inf\u00e9rieure \u00e0 l'objectif. Ce n'est en aucun cas la combinaison id\u00e9ale\u00a0; d'autres peuvent atteindre une valeur RMSE encore plus basse. Si l'objectif est de trouver le mod\u00e8le qui atteint la meilleure erreur, vous devrez utiliser une m\u00e9thode plus rigoureuse\u00a0; une recherche de param\u00e8tres, par exemple."
      ]
    },
    {
      "metadata": {
        "id": "IjkpSqmxqnSM",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "dnn_regressor = train_nn_regression_model(\n",
        "    learning_rate=0.001,\n",
        "    steps=2000,\n",
        "    batch_size=100,\n",
        "    hidden_units=[10, 10],\n",
        "    training_examples=training_examples,\n",
        "    training_targets=training_targets,\n",
        "    validation_examples=validation_examples,\n",
        "    validation_targets=validation_targets)"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "c6diezCSeH4Y",
        "colab_type": "text",
        "slideshow": {
          "slide_type": "slide"
        }
      },
      "cell_type": "markdown",
      "source": [
        " ## T\u00e2che\u00a02\u00a0: Effectuer une \u00e9valuation sur les donn\u00e9es de test\n",
        "\n",
        "**V\u00e9rifiez que vos performances de validation restent au m\u00eame niveau sur les donn\u00e9es de test.**\n",
        "\n",
        "D\u00e8s que vous disposez d'un mod\u00e8le satisfaisant, \u00e9valuez-le sur les donn\u00e9es de test pour le comparer aux performances de validation.\n",
        "\n",
        "Pour rappel, les donn\u00e9es de test se trouvent [ici](https://download.mlcc.google.com/mledu-datasets/california_housing_test.csv)."
      ]
    },
    {
      "metadata": {
        "id": "icEJIl5Vp51r",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          },
          "test": {
            "output": "ignore",
            "timeout": 600
          }
        },
        "cellView": "both"
      },
      "source": [
        "california_housing_test_data = pd.read_csv(\"https://download.mlcc.google.com/mledu-datasets/california_housing_test.csv\", sep=\",\")\n",
        "\n",
        "# YOUR CODE HERE"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "vvT2jDWjrKew",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ### Solution\n",
        "\n",
        "Cliquez ci-dessous pour afficher une solution."
      ]
    },
    {
      "metadata": {
        "id": "FyDh7Qy6rQb0",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Comme c'est le cas du code pr\u00e9sent\u00e9 ci-dessus, il vous suffit de charger le fichier de donn\u00e9es appropri\u00e9, de le pr\u00e9traiter, puis d'appeler predict et mean_squared_error.\n",
        "\n",
        "Notez qu'il n'est pas n\u00e9cessaire de rendre les donn\u00e9es de test al\u00e9atoires, dans la mesure o\u00f9 vous utiliserez tous les enregistrements."
      ]
    },
    {
      "metadata": {
        "id": "vhb0CtdvrWZx",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "california_housing_test_data = pd.read_csv(\"https://download.mlcc.google.com/mledu-datasets/california_housing_test.csv\", sep=\",\")\n",
        "\n",
        "test_examples = preprocess_features(california_housing_test_data)\n",
        "test_targets = preprocess_targets(california_housing_test_data)\n",
        "\n",
        "predict_testing_input_fn = lambda: my_input_fn(test_examples, \n",
        "                                               test_targets[\"median_house_value\"], \n",
        "                                               num_epochs=1, \n",
        "                                               shuffle=False)\n",
        "\n",
        "test_predictions = dnn_regressor.predict(input_fn=predict_testing_input_fn)\n",
        "test_predictions = np.array([item['predictions'][0] for item in test_predictions])\n",
        "\n",
        "root_mean_squared_error = math.sqrt(\n",
        "    metrics.mean_squared_error(test_predictions, test_targets))\n",
        "\n",
        "print(\"Final RMSE (on test data): %0.2f\" % root_mean_squared_error)"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    }
  ]
}
